healthiar R package workshop 1 for WP5
2025-03-24
Let’s start by checking out the BEST-COST GitHub repo and the README file
The README file contains the following information
healthiar R packagehealthiarThe main function family of healthiar is the attribute family, used to attribute health impacts to a (environmental) risk factor, e.g. noise or air pollution. The core function attribute_health can be used for most assessments.
Figure: healthiar overview
healthiar in RStudio 1/3RStudio startup screen
healthiar in RStudio 2/3Post installation, you can access the healthiar package landing page in RStudio by going to the Packages tab and then clicking on the healthiar package.
healthiar in RStudio 3/3note: first screenshot of full screen note: mention vignette opens as pdf or html
Landing page of the healthiar package in RStudio, where you find the package vignettes and function documentation.
healthiar in RStudio - Vignette (3/3)Intro vignette. You can open the intro_to_healthiar vignette within RStudio or as a HTML within your browser.
Tip
The vignette introduces healthiar step-by-step and contains (reproducible) examples.
Note
The vignette is a work in progress: we appreciate any feedback or suggestions you might have to make it more useful to future users!
note: show fun doc of attribute_healthiar
note: quickly introduce the main workflow of burden assessments
note: Quickly show function families
attribute call without input uncertaintiesGoal: attribute COPD cases to air pollution
Tip
healthiar comes with some example data that start with exdat_ that allow you to test functions.
results_pm_copd <-
healthiar::attribute_health(
erf_shape = "log_linear",
rr_central = exdat_pm_copd$relative_risk,
rr_lower = exdat_pm_copd$relative_risk_lower,
rr_upper = exdat_pm_copd$relative_risk_upper,
rr_increment = 10,
exp_central = exdat_pm_copd$mean_concentration,
cutoff_central = exdat_pm_copd$cut_off_value,
bhd_central = exdat_pm_copd$incidents_per_100_000_per_year/1E5*exdat_pm_copd$population_at_risk,
# bhd_central = exdat_pm_copd$incidence # Uncomment once change committed to main
) Every attribute output consists of two lists (“folders”)
health_main contains the main results
health_detailed detailed results (and in some cases even more information about the assessment/calculation)
note: add here that format is tibble (but can be changed to data frame)
note: screenshot of how to click on variable in env (put that as first option) . . .
Tip
This is personal preference! However, you might encounter all options.
Go to the Environment tab in RStudio and click on a variable to “open” it. Alternatively, you can use View(results_pm_copd), which has the same effect.
results_pm_copd$health_main$impact_rounded
Note: after typing the $ sign you can see all available options by pressing the tab key and use the arrows & tab keys to select an option (or alternatively use the mouse)
results_pm_copd[["health_main"]]
Note: if the cursor is located within the square braces you can see all available options by pressing the tab key
Using the purrr::pluck function to select a list and then the dplyr::pull function extract values from a specified column
results_pm_copd |> purrr::pluck("health_main") |> dplyr::pull("impact_rounded")
Note: available options can’t be displayed automatically using these functions -> better suited for a more permanent analysis script
| exp | bhd | rr | erf_ci | pop_fraction | impact_rounded |
|---|---|---|---|---|---|
| 8.85 | 30747 | 1.369 | central | 0.1138961 | 3502 |
| 8.85 | 30747 | 1.124 | lower | 0.0440064 | 1353 |
| 8.85 | 30747 | 1.664 | upper | 0.1780300 | 5474 |
attribute with input uncertaintiesGoal: attribute lung cancer deaths to PM2.5 exposure
Tip
See the intro vignette for a detailed description of output columns.
The health_detailed output table contains all different combinations of the arguments with uncertainty. E.g. rr_central with exp_lower and bhd_upper, …
| geo_id_disaggregated | erf_ci | exp_ci | bhd_ci | cutoff_ci | pop_fraction | impact | prop_pop_exp | rr_increment | erf_shape | exposure_name | approach_risk | health_outcome | exposure_dimension | exposure_type | exp | rr | bhd | cutoff | pop_fraction_type | rr_conc | impact_rounded |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | central | central | central | central | 0.1138961 | 3501.9619 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.369 | 30747 | 5 | paf | 1.128536 | 3502 |
| 1 | central | central | lower | central | 0.1138961 | 3189.0894 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.369 | 28000 | 5 | paf | 1.128536 | 3189 |
| 1 | central | central | upper | central | 0.1138961 | 3644.6736 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.369 | 32000 | 5 | paf | 1.128536 | 3645 |
| 1 | lower | central | central | central | 0.0440064 | 1353.0658 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.124 | 30747 | 5 | paf | 1.046032 | 1353 |
| 1 | lower | central | lower | central | 0.0440064 | 1232.1801 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.124 | 28000 | 5 | paf | 1.046032 | 1232 |
| 1 | lower | central | upper | central | 0.0440064 | 1408.2058 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.124 | 32000 | 5 | paf | 1.046032 | 1408 |
| 1 | upper | central | central | central | 0.1780300 | 5473.8882 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.664 | 30747 | 5 | paf | 1.216589 | 5474 |
| 1 | upper | central | lower | central | 0.1780300 | 4984.8398 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.664 | 28000 | 5 | paf | 1.216589 | 4985 |
| 1 | upper | central | upper | central | 0.1780300 | 5696.9598 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.85 | 1.664 | 32000 | 5 | paf | 1.216589 | 5697 |
| 1 | central | lower | central | central | 0.0899213 | 2764.8092 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.369 | 30747 | 5 | paf | 1.098806 | 2765 |
| 1 | central | lower | lower | central | 0.0899213 | 2517.7955 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.369 | 28000 | 5 | paf | 1.098806 | 2518 |
| 1 | central | lower | upper | central | 0.0899213 | 2877.4806 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.369 | 32000 | 5 | paf | 1.098806 | 2877 |
| 1 | lower | lower | central | central | 0.0344604 | 1059.5528 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.124 | 30747 | 5 | paf | 1.035690 | 1060 |
| 1 | lower | lower | lower | central | 0.0344604 | 964.8902 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.124 | 28000 | 5 | paf | 1.035690 | 965 |
| 1 | lower | lower | upper | central | 0.0344604 | 1102.7316 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.124 | 32000 | 5 | paf | 1.035690 | 1103 |
| 1 | upper | lower | central | central | 0.1416706 | 4355.9450 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.664 | 30747 | 5 | paf | 1.165054 | 4356 |
| 1 | upper | lower | lower | central | 0.1416706 | 3966.7760 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.664 | 28000 | 5 | paf | 1.165054 | 3967 |
| 1 | upper | lower | upper | central | 0.1416706 | 4533.4583 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 8.00 | 1.664 | 32000 | 5 | paf | 1.165054 | 4533 |
| 1 | central | upper | central | central | 0.1453304 | 4468.4726 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.369 | 30747 | 5 | paf | 1.170043 | 4468 |
| 1 | central | upper | lower | central | 0.1453304 | 4069.2501 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.369 | 28000 | 5 | paf | 1.170043 | 4069 |
| 1 | central | upper | upper | central | 0.1453304 | 4650.5716 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.369 | 32000 | 5 | paf | 1.170043 | 4651 |
| 1 | lower | upper | central | central | 0.0567717 | 1745.5580 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.124 | 30747 | 5 | paf | 1.060189 | 1746 |
| 1 | lower | upper | lower | central | 0.0567717 | 1589.6063 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.124 | 28000 | 5 | paf | 1.060189 | 1590 |
| 1 | lower | upper | upper | central | 0.0567717 | 1816.6929 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.124 | 32000 | 5 | paf | 1.060189 | 1817 |
| 1 | upper | upper | central | central | 0.2247829 | 6911.4001 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.664 | 30747 | 5 | paf | 1.289961 | 6911 |
| 1 | upper | upper | lower | central | 0.2247829 | 6293.9214 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.664 | 28000 | 5 | paf | 1.289961 | 6294 |
| 1 | upper | upper | upper | central | 0.2247829 | 7193.0531 | 1 | 10 | log_linear | NA | relative_risk | same_input_output | 1 | population_weighted_mean | 10.00 | 1.664 | 32000 | 5 | paf | 1.289961 | 7193 |
attribute with exposure categories (noise)Goal: attribute cases of high annoyance (HA) to noise exposure
exdat_noise_ha <-
exdat_noise_ha |>
dplyr::filter(!is.na(exdat_noise_ha$exposure_mean))
results_noise_ha <-
healthiar::attribute_health(
approach_risk = "absolute_risk",
exp_central = c(57.5, 62.5, 67.5, 72.5, 77.5),
population = sum(exdat_noise_ha$population_exposed_total), # TO DO: hard code input here
prop_pop_exp = exdat_noise_ha$population_exposed_total /
sum(exdat_noise_ha$population_exposed_total),
erf_eq_central = "78.9270-3.1162*c+0.0342*c^2")| geo_id_disaggregated | erf_ci | exp_ci | population | prop_pop_exp | exposure_name | approach_risk | health_outcome | exposure_dimension | exposure_type | exp | erf_eq | absolute_risk_as_percent | pop_exp | impact | impact_rounded | impact_per_100k_inhab |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | central | central | 945200 | 0.4099661 | NA | absolute_risk | same_input_output | 1 | exposure_distribution | 57.5 | 78.9270-3.1162c+0.0342c^2 | 12.81925 | 387500 | 49674.594 | 49675 | 5255.4585 |
| 1 | central | central | 945200 | 0.3025815 | NA | absolute_risk | same_input_output | 2 | exposure_distribution | 62.5 | 78.9270-3.1162c+0.0342c^2 | 17.75825 | 286000 | 50788.595 | 50789 | 5373.3173 |
| 1 | central | central | 945200 | 0.2029200 | NA | absolute_risk | same_input_output | 3 | exposure_distribution | 67.5 | 78.9270-3.1162c+0.0342c^2 | 24.40725 | 191800 | 46813.105 | 46813 | 4952.7196 |
| 1 | central | central | 945200 | 0.0763860 | NA | absolute_risk | same_input_output | 4 | exposure_distribution | 72.5 | 78.9270-3.1162c+0.0342c^2 | 32.76625 | 72200 | 23657.232 | 23657 | 2502.8811 |
| 1 | central | central | 945200 | 0.0081464 | NA | absolute_risk | same_input_output | 5 | exposure_distribution | 77.5 | 78.9270-3.1162c+0.0342c^2 | 42.83525 | 7700 | 3298.314 | 3298 | 348.9541 |
note: definitely include because WP5 will do
Check out the intro vignette for examples (to be added)
correlated exposures
life table analysis
get_daly
note: show how to export csv, …
Out of scope of healthiar
Using R, e.g. with ggplot2 package
Using Excel (once results are exported)
note: stress that they must install & test healthiar before 2nd WS
note: AC: not more than 2-3 exercises, one with provided numbers, one where they use exdat
note: put exercises in PDF on Teams
If you encounter challenges during installation, get in touch with us!
note: mention programme WS2 & explicitly YOU WILL HAVE TO PROGRAMME IN RSTUDIO USING HEALTHIAR
note: document any suggestions for improvements
Social analysis
note: if time allows